    A fast algorithm for detecting gene-gene interactions in genome-wide association studies

    With the recent advent of high-throughput genotyping techniques, genetic data for genome-wide association studies (GWAS) have become increasingly available, which entails the development of efficient and effective statistical approaches. Although many such approaches have been developed and used to identify single-nucleotide polymorphisms (SNPs) that are associated with complex traits or diseases, few are able to detect gene-gene interactions among different SNPs. Genetic interactions, also known as epistasis, have been recognized to play a pivotal role in contributing to the genetic variation of phenotypic traits. However, because of an extremely large number of SNP-SNP combinations in GWAS, the model dimensionality can quickly become so overwhelming that no prevailing variable selection methods are capable of handling this problem. In this paper, we present a statistical framework for characterizing main genetic effects and epistatic interactions in a GWAS study. Specifically, we first propose a two-stage sure independence screening (TS-SIS) procedure and generate a pool of candidate SNPs and interactions, which serve as predictors to explain and predict the phenotypes of a complex trait. We also propose a rates adjusted thresholding estimation (RATE) approach to determine the size of the reduced model selected by an independence screening. Regularization regression methods, such as LASSO or SCAD, are then applied to further identify important genetic effects. Simulation studies show that the TS-SIS procedure is computationally efficient and has an outstanding finite sample performance in selecting potential SNPs as well as gene-gene interactions. We apply the proposed framework to analyze an ultrahigh-dimensional GWAS data set from the Framingham Heart Study, and select 23 active SNPs and 24 active epistatic interactions for the body mass index variation. 
    This demonstrates the capability of our procedure to resolve the complexity of genetic control. Comment: Published at http://dx.doi.org/10.1214/14-AOAS771 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)
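    The screening idea in this abstract can be illustrated with a minimal sketch. It assumes that screening ranks predictors by absolute marginal correlation with the phenotype, and that the second stage only considers products of SNPs retained in the first stage. The abstract does not spell out either detail, so the function names, the staging, and the toy genotypes below are illustrative assumptions, not the authors' TS-SIS or RATE procedure.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no marginal signal
    return sxy / math.sqrt(sxx * syy)


def screen(columns, y, keep):
    """Return the `keep` keys with the largest |marginal correlation| with y."""
    ranked = sorted(columns, key=lambda k: abs(pearson(columns[k], y)), reverse=True)
    return ranked[:keep]


def two_stage_screen(snps, y, keep_main, keep_pairs):
    """Stage 1 screens single SNPs; stage 2 screens products of the survivors.

    Assumption: interactions are only formed among SNPs kept in stage 1.
    """
    main = screen(snps, y, keep_main)
    pairs = {
        (a, b): [u * v for u, v in zip(snps[a], snps[b])]
        for a, b in combinations(main, 2)
    }
    return main, screen(pairs, y, keep_pairs)


# Made-up genotype codes (0/1/2) and phenotype, for illustration only.
toy_snps = {
    "s1": [0, 1, 2, 0, 1, 2, 0, 1],
    "s2": [2, 2, 0, 0, 1, 1, 0, 2],
    "s3": [1, 0, 1, 0, 1, 0, 1, 0],
}
toy_y = [0, 3, 2, 0, 2, 4, 0, 3]
print(two_stage_screen(toy_snps, toy_y, keep_main=2, keep_pairs=1))
```

    In the framework described above, the retained main effects and interactions would then be passed to a regularized regression such as LASSO or SCAD; that step is omitted here.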

    Higher Order Dynamics in the Replicator Equation Produce a Limit Cycle in Rock-Paper-Scissors

    Recent work has shown that pairwise interactions may not be sufficient to fully model ecological dynamics in the wild. In this letter, we consider a replicator dynamic that takes both pairwise and triadic interactions into consideration using a rank-three tensor. We study these new nonlinear dynamics using a generalized rock-paper-scissors game whose dynamics are well understood in the standard replicator sense. We show that the addition of higher-order dynamics leads to the creation of a subcritical Hopf bifurcation and consequently an unstable limit cycle. It is known that this kind of behaviour cannot occur in the pairwise replicator in any three-strategy game, showing the effect higher-order interactions can have on the resulting dynamics of the system. We numerically characterize parameter regimes in which limit cycles exist and discuss possible ways to generalize this approach to studying higher-order interactions. Comment: 7 pages, 4 figures - Reference added from Version
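    As a rough illustration of a replicator dynamic with both pairwise and triadic terms, the sketch below assumes a generic form in which the fitness of strategy i is sum_j A[i][j] x[j] + sum_{j,k} B[i][j][k] x[j] x[k], and each share grows at its fitness minus the population mean. This form, the function name `replicator_rhs`, and the tensor entries are assumptions made for the example; the letter's exact parameterization may differ.

```python
def replicator_rhs(x, A, B):
    """Time derivative of the strategy shares x under pairwise (A) and triadic (B) payoffs."""
    n = len(x)
    fitness = [
        sum(A[i][j] * x[j] for j in range(n))
        + sum(B[i][j][k] * x[j] * x[k] for j in range(n) for k in range(n))
        for i in range(n)
    ]
    mean_fitness = sum(x[i] * fitness[i] for i in range(n))
    return [x[i] * (fitness[i] - mean_fitness) for i in range(n)]


# Standard zero-sum rock-paper-scissors payoff matrix.
RPS = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]

# Hypothetical rank-three tensor, chosen only to make the example concrete.
TRIADIC = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
TRIADIC[0][1][2] = TRIADIC[1][2][0] = TRIADIC[2][0][1] = 0.5

print(replicator_rhs([0.5, 0.3, 0.2], RPS, TRIADIC))
```

    Because the mean fitness is subtracted, the derivatives sum to zero whenever the shares sum to one, so trajectories stay on the simplex. Studying the Hopf bifurcation itself would require integrating this system over a range of parameters, which is beyond this sketch.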

    A joint model for nonparametric functional mapping of longitudinal trajectory and time-to-event

    BACKGROUND: The characterization of the relationship between a longitudinal response process and a time-to-event has been a pressing challenge in biostatistical research. This has emerged as an important issue in genetic studies when one attempts to detect the common genes or quantitative trait loci (QTL) that govern both a longitudinal trajectory and a developmental event. RESULTS: We present a joint statistical model for functional mapping of dynamic traits in which the event times and longitudinal traits are taken to depend on a common set of genetic mechanisms. By fitting orthogonal Legendre polynomials to the time-dependent mean vector, our model does not rely on any curve, which is different from earlier parametric models of functional mapping. This newly developed nonparametric model is demonstrated and validated by an example for a forest tree in which stemwood growth and the time to first flower are jointly modelled. CONCLUSION: Our model allows for the detection of specific QTL that govern both longitudinal traits and developmental processes through either pleiotropic effects or close linkage, or both. This model will have great implications for integrating longitudinal and event data to gain better insights into comprehensive biology and biomedicine.
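    To make the Legendre-polynomial idea concrete, the following sketch evaluates a mean trajectory as a linear combination of Legendre polynomials computed with Bonnet's recurrence. The rescaling of time onto [-1, 1], the function names, and the coefficients are assumptions for illustration; the abstract does not describe how the authors parameterize or estimate the mean vector.

```python
def legendre_basis(t, order):
    """Values P_0(t), ..., P_order(t) for t in [-1, 1], via Bonnet's recurrence."""
    values = [1.0, t]
    for n in range(1, order):
        # (n + 1) P_{n+1}(t) = (2n + 1) t P_n(t) - n P_{n-1}(t)
        values.append(((2 * n + 1) * t * values[n] - n * values[n - 1]) / (n + 1))
    return values[: order + 1]


def rescale(time, t_min, t_max):
    """Map a measurement time in [t_min, t_max] onto [-1, 1]."""
    return 2.0 * (time - t_min) / (t_max - t_min) - 1.0


def mean_curve(time, coeffs, t_min, t_max):
    """Mean trajectory at `time` as a Legendre expansion with the given coefficients."""
    basis = legendre_basis(rescale(time, t_min, t_max), len(coeffs) - 1)
    return sum(c * p for c, p in zip(coeffs, basis))


# Hypothetical coefficients for one genotype, evaluated at three time points.
print([mean_curve(t, [2.0, 1.0, -0.5], 0.0, 10.0) for t in (0.0, 5.0, 10.0)])
```

    In a functional mapping setting, each QTL genotype would presumably carry its own coefficient vector, so that comparing genotypes reduces to comparing a small number of coefficients.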

    Semiparametric Quantitative-Trait-Locus Mapping: I. on Functional Growth Curves

    The genetic study of certain quantitative traits in growth curves as a function of time has recently been of major scientific interest for exploring the developmental evolution processes of biological subjects. Various parametric approaches have been proposed in the statistical literature to study the quantitative-trait-loci (QTL) mapping of growth curves as multivariate outcomes. In this article, we view the growth curves as functional quantitative traits and propose some semiparametric models to relax the strong parametric assumptions, which may not always be practical in reality. Appropriate inference procedures are developed to estimate the parameters of interest that characterise the possible QTLs of the growth curves in the models. Recently developed multiple comparison testing procedures are applied to locate the statistically meaningful QTLs. Numerical examples are presented with simulation studies and an analysis of real data.

    Systems Mapping: How to Improve the Genetic Mapping of Complex Traits Through Design Principles of Biological Systems

    Background: Every phenotypic trait can be viewed as a “system” in which a group of interconnected components function synergistically to yield a unified whole. Once a system’s components and their interactions have been delineated according to biological principles, we can manipulate and engineer functionally relevant components to produce a desirable system phenotype. Results: We describe a conceptual framework for mapping quantitative trait loci (QTLs) that control complex traits by treating trait formation as a dynamic system. This framework, called systems mapping, incorporates a system of differential equations that quantifies how alterations of different components lead to the global change of trait development and function through genes, and provides a quantitative and testable platform for assessing the interplay between gene action and development. We applied systems mapping to analyze biomass growth data in a mapping population of soybeans and identified specific loci that are responsible for the dynamics of biomass partitioning to leaves, stem, and roots. Conclusions: We show that systems mapping implemented by design principles of biological systems is quite versatile for deciphering the genetic machineries for size-shape, structural-functional, sink-source and pleiotropic relationships underlying plant physiology and development. Systems mapping should enable geneticists to shed light on the genetic complexity of any biological system in plants and other organisms and predict its physiological and pathological states.

    A multilocus likelihood approach to joint modeling of linkage, parental diplotype and gene order in a full-sib family

    BACKGROUND: Unlike a pedigree initiated with two inbred lines, a full-sib family derived from two outbred parents frequently has many different segregation types of markers whose linkage phases are not known prior to linkage analysis. RESULTS: We formulate a general model for simultaneously estimating linkage, parental diplotype and gene order through multi-point analysis in a full-sib family. Our model is based on a multinomial mixture model that takes into account different diplotypes and gene orders, weighted by their corresponding probabilities of occurrence. The EM algorithm is implemented to provide the maximum likelihood estimates of the linkage, parental diplotype and gene order over any type of markers. CONCLUSIONS: Through simulation studies, this model is found to be more computationally efficient than existing models for linkage mapping. We discuss the extension of the model and its implications for genome mapping in outcrossing species.

    Semiparametric Quantitative-Trait-Locus Mapping: II. on Censored Age-at-Onset

    In genetic studies, the variation in genotypes may not only affect different inheritance patterns in qualitative traits, but may also affect the age-at-onset as a quantitative trait. In this article, we use standard cross designs, such as backcross or F2, to propose hazard regression models, namely additive hazards models, in quantitative trait loci mapping for age-at-onset, although the developed method can be extended to more complex designs. With additive invariance of the additive hazards models in mixture probabilities, we develop flexible semiparametric methodologies in interval regression mapping without a heavy computing burden. A recently developed multiple comparison procedure is adapted to identify the QTL in dense maps. The proposed methodologies will be evaluated by simulation studies and demonstrated in an actual data analysis of forest tree growth.

    A General Quantitative Genetic Model for Haplotyping a Complex Trait in Humans

    Uncertainty about linkage phases of multiple single nucleotide polymorphisms (SNPs) in heterozygous diploids challenges the identification of specific DNA sequence variants that encode a complex trait. A statistical technique implemented with the EM algorithm has been developed to infer the effects of SNP haplotypes from genotypic data by assuming that one haplotype (called the risk haplotype) performs differently from the rest (called the non-risk haplotypes). This assumption simplifies the definition and estimation of genotypic values of diplotypes for a complex trait, but will reduce the power to detect the risk haplotype when non-risk haplotypes contain substantial diversity. In this article, we incorporate general quantitative genetic theory to specify the differentiation of different haplotypes in terms of their genetic control of a complex trait. A model selection procedure is deployed to test the best number and combination of risk haplotypes, thus providing a precise and powerful test of genetic determination in association studies. Our method is derived from maximum likelihood theory and has been shown through simulation studies to be powerful for the characterization of the genetic architecture of complex quantitative traits.

    Genetic mapping of complex traits by minimizing integrated square errors

    Background: Genetic mapping has been used as a tool to study the genetic architecture of complex traits by localizing their underlying quantitative trait loci (QTLs). Statistical methods for genetic mapping rely on a key assumption, namely that traits obey a parametric distribution. However, in practice real data may not perfectly follow the specified distribution. Results: Here, we derive a robust statistical approach for QTL mapping that accommodates a certain degree of misspecification of the true model by incorporating integrated square errors into the genetic mapping framework. A hypothesis test is formulated by defining a new test statistic, the energy difference. Conclusions: Simulation studies were performed to investigate the statistical properties of this approach and compare these properties with those from traditional maximum likelihood and non-parametric QTL mapping approaches. Lastly, analyses of real examples were conducted to demonstrate the usefulness and utilization of the new approach in a practical genetic setting.